HEALNet -- Hybrid Multi-Modal Fusion for Heterogeneous Biomedical Data
Technological advances in medical data collection such as high-resolution
histopathology and high-throughput genomic sequencing have contributed to the
rising requirement for multi-modal biomedical modelling, specifically for
image, tabular, and graph data. Most multi-modal deep learning approaches use
modality-specific architectures that are trained separately and cannot capture
the crucial cross-modal information that motivates the integration of different
data sources. This paper presents the Hybrid Early-fusion Attention Learning
Network (HEALNet): a flexible multi-modal fusion architecture, which a)
preserves modality-specific structural information, b) captures the cross-modal
interactions and structural information in a shared latent space, c) can
effectively handle missing modalities during training and inference, and d)
enables intuitive model inspection by learning on the raw data input instead of
opaque embeddings. We conduct multi-modal survival analysis on Whole Slide
Images and Multi-omic data on four cancer cohorts of The Cancer Genome Atlas
(TCGA). HEALNet achieves state-of-the-art performance, substantially improving
over both uni-modal and recent multi-modal baselines, whilst being robust in
scenarios with missing modalities.
Comment: 7 pages body, 5 pages appendix
Constraining Variational Inference with Geometric Jensen-Shannon Divergence
We examine the problem of controlling divergences for latent space
regularisation in variational autoencoders. Specifically, when aiming to
reconstruct example $x \in \mathbb{R}^{m}$ via latent space
$z \in \mathbb{R}^{n}$ ($n \leq m$), while balancing this against the need for
generalisable latent representations. We present a regularisation mechanism
based on the skew-geometric Jensen-Shannon divergence
$\left(\mathrm{JS}^{\mathrm{G}_{\alpha}}\right)$. We find a variation in
$\mathrm{JS}^{\mathrm{G}_{\alpha}}$, motivated by limiting cases, which leads
to an intuitive interpolation between forward and reverse KL in the space of
both distributions and divergences. We motivate its potential benefits for VAEs
through low-dimensional examples, before presenting quantitative and
qualitative results. Our experiments demonstrate that skewing our variant of
$\mathrm{JS}^{\mathrm{G}_{\alpha}}$, in the context of
$\mathrm{JS}^{\mathrm{G}_{\alpha}}$-VAEs, leads to better reconstruction and
generation when compared to several baseline VAEs. Our approach is entirely
unsupervised and utilises only one hyperparameter which can be easily
interpreted in latent space.
Comment: Camera-ready version, accepted at NeurIPS 202
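
The interpolation between forward and reverse KL described in this abstract can be illustrated numerically. The sketch below is an assumption, not the paper's implementation: it works on discrete distributions with strictly positive entries, and it takes the intermediate distribution to be the normalised weighted geometric mean proportional to p^alpha * q^(1 - alpha), one form that yields the stated limiting behaviour. The function names `kl`, `geometric_mean` and `skew_geometric_js` are hypothetical.

```python
import numpy as np


def kl(p, q):
    """KL(p || q) for discrete distributions with strictly positive entries."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * (np.log(p) - np.log(q))))


def geometric_mean(p, q, alpha):
    """Normalised weighted geometric mean, proportional to p**alpha * q**(1 - alpha).

    Assumption: this weighting is chosen so that alpha = 0 returns q
    and alpha = 1 returns p.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    g = p**alpha * q ** (1.0 - alpha)
    return g / g.sum()


def skew_geometric_js(p, q, alpha):
    """(1 - alpha) * KL(p || g) + alpha * KL(q || g), with g the geometric mean above."""
    g = geometric_mean(p, q, alpha)
    return (1.0 - alpha) * kl(p, g) + alpha * kl(q, g)


if __name__ == "__main__":
    p = np.array([0.7, 0.2, 0.1])
    q = np.array([0.2, 0.3, 0.5])
    for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(alpha, skew_geometric_js(p, q, alpha))
```

Under these assumptions, alpha = 0 recovers KL(p || q) and alpha = 1 recovers KL(q || p), so the single skew parameter moves between the two directions of KL.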
GCondNet: A Novel Method for Improving Neural Networks on Small High-Dimensional Tabular Data
Neural network models often struggle with high-dimensional but small
sample-size tabular datasets. One reason is that current weight initialisation
methods assume independence between weights, which can be problematic when
there are insufficient samples to estimate the model's parameters accurately.
In such small data scenarios, leveraging additional structures can improve the
model's training stability and performance. To address this, we propose
GCondNet, a general approach to enhance neural networks by leveraging implicit
structures present in tabular data. We create a graph between samples for each
data dimension, and utilise Graph Neural Networks (GNNs) for extracting this
implicit structure, and for conditioning the parameters of the first layer of
an underlying predictor MLP network. By creating many small graphs, GCondNet
exploits the data's high-dimensionality, and thus improves the performance of
an underlying predictor network. We demonstrate the effectiveness of our method
on nine real-world datasets, where GCondNet outperforms 14 standard and
state-of-the-art methods. The results show that GCondNet is robust and can be
applied to any small sample-size and high-dimensional tabular learning task.
Comment: Early version presented at the 17th Machine Learning in Computational
Biology (MLCB) meeting, 202
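
The step of creating one graph between samples for each data dimension can be sketched as follows. This is a minimal illustration under stated assumptions, not GCondNet's implementation: the abstract does not say how edges are chosen, so the sketch connects each sample to its k nearest neighbours in the value of a single feature. The function name `per_feature_knn_graphs` and the parameter `k` are hypothetical, and the GNN and MLP conditioning stages are omitted.

```python
import numpy as np


def per_feature_knn_graphs(X, k=2):
    """Build one undirected sample graph per feature (column) of X.

    Assumption: edges link each sample to its k nearest samples, measured by
    absolute difference in that single feature. Requires k < n_samples.
    Returns a list of boolean adjacency matrices, one per feature.
    """
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape
    if not 0 < k < n_samples:
        raise ValueError("k must satisfy 0 < k < n_samples")

    graphs = []
    for j in range(n_features):
        col = X[:, j]
        # Pairwise distances between samples along feature j only.
        dist = np.abs(col[:, None] - col[None, :])
        np.fill_diagonal(dist, np.inf)  # exclude self-loops
        neighbours = np.argsort(dist, axis=1)[:, :k]

        adj = np.zeros((n_samples, n_samples), dtype=bool)
        rows = np.repeat(np.arange(n_samples), k)
        adj[rows, neighbours.ravel()] = True
        adj |= adj.T  # symmetrise to make the graph undirected
        graphs.append(adj)
    return graphs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(6, 100))  # few samples, many features
    graphs = per_feature_knn_graphs(X, k=2)
    print(len(graphs), graphs[0].shape)
```

With few samples and many features, this produces many small graphs (one per column), which matches the abstract's description of exploiting high dimensionality.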
In-Domain Self-Supervised Learning Can Lead to Improvements in Remote Sensing Image Classification
Self-supervised learning (SSL) has emerged as a promising approach for remote
sensing image classification due to its ability to leverage large amounts of
unlabeled data. In contrast to traditional supervised learning, SSL aims to
learn representations of data without the need for explicit labels. This is
achieved by formulating auxiliary tasks that can be used to create
pseudo-labels for the unlabeled data and learn pre-trained models. The
pre-trained models can then be fine-tuned on downstream tasks such as remote
sensing image scene classification. The paper analyzes the effectiveness of SSL
pre-training on Million AID, a large unlabeled remote sensing dataset, with
various remote sensing image scene classification datasets as downstream tasks.
More specifically, we evaluate the effectiveness of SSL pre-training using the
iBOT framework coupled with Vision transformers (ViT) in contrast to supervised
pre-training of ViT using the ImageNet dataset. The comprehensive experimental
work across 14 datasets with diverse properties reveals that in-domain SSL
leads to improved predictive performance of models compared to the supervised
counterparts.
Current Trends in Deep Learning for Earth Observation: An Open-source Benchmark Arena for Image Classification
We present 'AiTLAS: Benchmark Arena' -- an open-source benchmark framework
for evaluating state-of-the-art deep learning approaches for image
classification in Earth Observation (EO). To this end, we present a
comprehensive comparative analysis of more than 400 models derived from nine
different state-of-the-art architectures, and compare them on a variety of
multi-class and multi-label classification tasks from 22 datasets with
different sizes and properties. In addition to models trained entirely on these
datasets, we also benchmark models trained in the context of transfer learning,
leveraging pre-trained model variants, as is typically done in
practice. All presented approaches are general and can be easily extended to
many other remote sensing image classification tasks not considered in this
study. To ensure reproducibility and facilitate better usability and further
developments, all of the experimental resources including the trained models,
model configurations and processing details of the datasets (with their
corresponding splits used for training and evaluating the models) are publicly
available on the repository: https://github.com/biasvariancelabs/aitlas-arena
Enhancing Representation Learning on High-Dimensional, Small-Size Tabular Data: A Divide and Conquer Method with Ensembled VAEs
Variational Autoencoders and their many variants have displayed impressive
ability to perform dimensionality reduction, often achieving state-of-the-art
performance. Many current methods, however, struggle to learn good
representations in High Dimensional, Low Sample Size (HDLSS) tasks, which is an
inherently challenging setting. We address this challenge by using an ensemble
of lightweight VAEs to learn posteriors over subsets of the feature-space,
which get aggregated into a joint posterior in a novel divide-and-conquer
approach. Specifically, we present an alternative factorisation of the joint
posterior that induces a form of implicit data augmentation that yields greater
sample efficiency. Through a series of experiments on eight real-world
datasets, we show that our method learns better latent representations in HDLSS
settings, which leads to higher accuracy in a downstream classification task.
Furthermore, we verify that our approach has a positive effect on
disentanglement and achieves a lower estimated Total Correlation on learnt
representations. Finally, we show that our approach is robust to partial
features at inference, exhibiting little performance degradation even with most
features missing.